DELAY-CFIM: A Sliding Window Based Method on Mining Closed Frequent Itemsets over High-Speed Data Streams

Authors

  • Chunkai Zhang
  • Yulong Hu
  • Lei Zhang
Abstract

Closed frequent itemset mining plays an essential role in data stream mining and can be used in business decision-making, basket analysis, and similar applications. Most methods for mining closed frequent itemsets store summarized information in a compact data structure as data is generated and, whenever a query is submitted, output all closed frequent itemsets. However, the online processing of existing approaches is so slow that they cannot keep up with data streams generated at high speed. In this paper, a novel method, DELAY-CFIM, is proposed to solve the problem of slow online processing. It divides closed frequent itemset mining over data streams into two steps. First, as transactions are generated, it stores the frequency information of itemsets in a summary data structure. Second, it delays mining closed frequent itemsets until a query is submitted. Deferring the mining step in this way improves the speed of online processing.

Keywords: closed frequent itemset, data stream, sliding window.
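The two-step idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not describe the actual summary data structure or mining procedure of DELAY-CFIM, so everything here is an assumption made for illustration: the class name `DelayedClosedMiner`, its parameters `window_size` and `min_support`, the use of a plain transaction counter as the summary, and the brute-force subset enumeration at query time (only practical for short transactions).

```python
from collections import Counter, deque
from itertools import combinations


class DelayedClosedMiner:
    """Illustrative sketch only, not the paper's data structure."""

    def __init__(self, window_size, min_support):
        self.window_size = window_size   # number of transactions kept
        self.min_support = min_support   # absolute support threshold
        self.window = deque()            # transactions in arrival order
        self.counts = Counter()          # summary: transaction -> occurrences

    def add(self, transaction):
        """Online step: cheap bookkeeping only, no mining."""
        t = frozenset(transaction)
        self.window.append(t)
        self.counts[t] += 1
        if len(self.window) > self.window_size:
            old = self.window.popleft()  # slide the window forward
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def query(self):
        """Delayed step: mine closed frequent itemsets on demand."""
        support = Counter()
        for t, c in self.counts.items():
            items = sorted(t)
            # Brute-force enumeration of every non-empty subset.
            for k in range(1, len(items) + 1):
                for sub in combinations(items, k):
                    support[frozenset(sub)] += c
        frequent = {s: n for s, n in support.items() if n >= self.min_support}
        # An itemset is closed if no proper superset has the same support.
        return {
            s: n
            for s, n in frequent.items()
            if not any(s < other and n == m for other, m in frequent.items())
        }


miner = DelayedClosedMiner(window_size=4, min_support=2)
for tx in (["a", "b", "c"], ["a", "b"], ["a", "b", "c"], ["a", "d"]):
    miner.add(tx)
closed = miner.query()
for itemset, count in sorted(closed.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), count)
```

In this sketch `add` does constant-time work per transaction, while all itemset enumeration and the closedness check are postponed to `query`, which mirrors the division of labor the abstract describes.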


Similar resources

Mining frequent itemsets over data streams using efficient window sliding techniques

Online mining of frequent itemsets over a stream sliding window is one of the most important problems in stream data mining, with broad applications. It is also a difficult problem, since streaming data has some challenging characteristics, such as unknown or unbounded size, a possibly very fast arrival rate, the inability to backtrack over previously arrived transactions, and a lack of system co...

Incremental updates of closed frequent itemsets over continuous data streams

Online mining of closed frequent itemsets over streaming data is one of the most important issues in mining data streams. In this paper, we propose an efficient one-pass algorithm, NewMoment to maintain the set of closed frequent itemsets in data streams with a transaction-sensitive sliding window. An effective bit-sequence representation of items is used in the proposed algorithm to reduce the...

An Efficient Incremental Algorithm to Mine Closed Frequent Itemsets over Data Streams

The purpose of this work is to mine closed frequent itemsets from transactional data streams using a sliding window model. An efficient algorithm, IMCFI, is proposed for Incremental Mining of Closed Frequent Itemsets from a transactional data stream. IMCFI uses a data structure called INdexed Tree (INT), similar to the NewCET structure used in NewMoment [5]. INT contains an index table Item...

An Efficient Mining Algorithm by Bit Vector Table for Frequent Closed Itemsets

Mining frequent closed itemsets in data streams is an important task in stream data mining. In this paper, an efficient mining algorithm (denoted as EMAFCI) for frequent closed itemsets in data stream is proposed. The algorithm is based on the sliding window model, and uses a Bit Vector Table (denoted as BVTable) where the transactions and itemsets are recorded by the column and row vectors res...

Mining Top-k Frequent Closed Itemsets in Data Streams Using Sliding Window

Frequent itemset mining has become a popular research area in the data mining community over the last few years. There are two main technical difficulties in finding frequent itemsets. First, an appropriate minimum support value must be provided to start, and the user needs to tune this value by running the algorithm again and again. Second, the generated frequent itemsets are mostly numerous and ...


Publication date: 2013